PROCOLLAGEN ASSEMBLY

The present invention relates to a method of regulating assembly of
procollagens and derivatives thereof.

Most cells, whether simple unicellular organisms or cells from human tissue,
are surrounded by an intricate network of macromolecules which is known as the
extracellular matrix and which is comprised of a variety of proteins and
polysaccharides. The major protein component of this matrix is a family of related
proteins called the collagens which are thought to constitute approximately 25% of
total proteins in mammals. There are at least 20 genetically distinct types of collagen
molecule, some of which are known as fibrillar collagens (collagen types I, II, III, V
and XI) because they typically form large fibres, known as collagen fibrils, that may
be many micrometers long and may be visualised by electron microscopy.



Collagen fibrils are comprised of polymers of collagen molecules and are
produced by a process which involves conversion of procollagen to collagen
molecules which then assemble to form the polymer. Procollagen consists of a triple
stranded helical domain in the centre of the molecule and has non-helical regions at
the amino terminal (known as the N-terminal propeptide) and at the carboxy terminal
(known as the C-terminal propeptide). The triple stranded helical domain is made up
of three polypeptides which are known as α chains. Procollagen is synthesised
intracellularly from pro-α chains (α chains with N- and C-terminal propeptide
domains) on membrane-bound ribosomes following which the pro-α chains are
inserted into the endoplasmic reticulum.



Within the endoplasmic reticulum the pro-α chains are assembled into
procollagen molecules. This assembly can be divided into two stages: an initial
recognition event between the pro-α chains which determines chain selectivity and
then a registration event which leads to correct alignment of the triple helix.
Procollagen assembly is initiated by association of the C-terminal propeptide domains
of each pro-α chain to form the C-terminal propeptide. Assembly of the triple helix
domain then proceeds in a C- to N-terminal direction and is completed by formation
of the N-terminal propeptide. The mature procollagen molecules are ultimately
secreted into the extracellular environment where they are converted into collagen by
the action of Procollagen N-Proteinases (which cleave the N-terminal propeptide) and
Procollagen C-Proteinases (which cleave the C-terminal propeptide). Once the
propeptides have been removed the collagen molecules thus formed are able to
aggregate spontaneously to form the collagen fibrils.

Collagens have many uses industrially. For instance, collagen gels can be
formed from collagen fibrils in vitro and may be used to support cell attachment. Such
gels may be used in cell culture to maintain the phenotype of certain cells, such as
chondrocytes explanted from cartilage. Collagen may also be used as a "stuffer" or
packing agent surgically and is particularly known to be used in cosmetic surgery, for
enlarging the appearance of lips for instance. In vivo, collagen is a major component
of the extracellular matrix and serves a multitude of purposes. Numerous diseases are
known which involve abnormalities in collagen synthesis and regulation. Procollagens
and derivatives thereof may be used (or be of potential use) for the treatment of these
diseases.

Large quantities of procollagens or derivatives thereof need to be synthesised
to meet increasing industrial demand. A convenient means of synthesising
procollagens or derivatives thereof is by expression of exogenous pro-α chains in a
host cell followed by the assembly of pro-α chains into the procollagen or derivative
thereof. For this to occur it is necessary to ensure that any host cell used has the
necessary post-translational facilities required to assemble procollagens from pro-α
chains. This may be achieved by expression in cells which normally synthesise
procollagen. However one problem in such systems is that endogenously expressed
pro-α chains can co-assemble with the exogenously introduced pro-α chains giving
rise to undesirable hybrid molecules.

In other circumstances it may be desirable to generate two or more
procollagens from distinct pro-α chains of an exogenous source in a host cell, in which
case it is required that co-assembly of pro-α chains to form undesirable hybrid
molecules should not occur.

It is also conceivable that procollagens may need to be assembled in a cell-free
system in vitro, in which case co-assembly of pro-α chains giving rise to undesirable
hybrid molecules also needs to be avoided.

It is an object of the present invention to provide a means by which pro-α
chains or derivatives thereof may be assembled into desired procollagens or
derivatives thereof without undesirable co-assembling with other pro-α chains.

According to the present invention there is provided a method of producing a
desired procollagen or derivative thereof in a system which co-expresses and
assembles at least one further procollagen or derivative thereof wherein the gene(s) for
expressing pro-α chains or derivatives thereof for assembly into the desired
procollagen has or have been exogenously selected from natural pro-α chains or
exogenously manipulated such as to express said pro-α chains or derivatives thereof
with domains which have the activity of C-terminal propeptide domains but which
will not co-assemble with the C-terminal propeptide of the pro-α chains or
derivatives thereof that assemble to form the said at least one further procollagen or
derivative thereof.

By "procollagen or derivative thereof" and by "pro-α chain or derivative thereof"
we mean molecules of procollagen or pro-α chains respectively that may be identical
to those found in nature or may be non-natural derivatives which may be proteins or
derivatives of proteins. Non-natural derivatives may also have non-protein domains or
even be entirely a non-protein provided that the derivative contains a domain with
activity of a C-terminal propeptide domain which will not co-assemble with the
C-terminal propeptide domains of the pro-α chains or derivatives thereof that assemble
to form at least one further procollagen or derivative thereof.

Preferred pro-α chain derivatives comprise a domain with the activity of a
C-terminal propeptide domain and a further domain which is at least partially capable of
trimerising to form a triple helix.

Thus the exogenously selected or exogenously manipulated genes may express
pro-α chains or derivatives thereof that may be assembled into trimers to form
procollagen molecules or derivatives thereof, which in turn may be formed into
collagen polymers following exposure to Procollagen C-Proteinase and Procollagen
N-Proteinases (which respectively cleave the C- and N-terminal propeptides from the
procollagen molecules to form monomers which aggregate spontaneously to form the
collagen polymers). The collagen polymer is preferably a fibrillar collagen.

The invention is based upon the recognition by the inventors that a crucial
stage in the assembly of procollagens is an initial recognition step between pro-α
chains which ensures that pro-α chains assemble in a type-specific manner. This
recognition step involves a recognition sequence in the C-terminal propeptide domain
of pro-α chains. For instance, a single cell may synthesise several collagen types and,
therefore, several different pro-α chains, yet these chains are able to discriminate
between C-terminal propeptide domains to ensure type-specific assembly. One
example of this discrimination can be found in cells expressing both type I and type
III procollagen. Here at least three pro-α chains are synthesised, namely proα1(I),
proα2(I) and proα1(III) chains. However the only procollagens formed are
[proα1(I)]₂proα2(I) heterotrimers and [proα1(III)]₃ homotrimers. Other combinations
of pro-α chains do not assemble into procollagens.

In PCT/GB96/02122 (WO-A-97/08311), the disclosure of which is
incorporated by reference, we have disclosed that specific regions within the
C-terminal propeptide are the recognition sequences involved in the specificity of
association between C-terminal propeptide domains of pro-α chains during the
formation of procollagens. These recognition sequences were identified as having the
following amino acid sequences for each respective pro-α chain:



[Sequence table: recognition sequences listed by chain, from pro-α1(I) and pro-α2(I)
through pro-α2(XI); the amino acid sequences themselves are missing from this copy.]

These recognition sequences confer selectivity and specificity of pro-α chain
association.
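
The type-specific selectivity described above can be pictured as a simple lookup. The following is a minimal sketch, not part of the disclosed method: it assumes that the only productive trimers are the two compositions reported above for cells expressing type I and type III procollagen, and the names `ALLOWED_TRIMERS`, `assembles` and `productive_trimers` are hypothetical.

```python
from collections import Counter
from itertools import combinations_with_replacement

# Trimer compositions reported in the text for cells expressing both
# type I and type III procollagen (chain name, number of copies).
ALLOWED_TRIMERS = {
    frozenset({("pro-alpha1(I)", 2), ("pro-alpha2(I)", 1)}),
    frozenset({("pro-alpha1(III)", 3)}),
}


def assembles(chains):
    """Return True if the three chains match a reported procollagen composition."""
    chains = tuple(chains)
    if len(chains) != 3:
        raise ValueError("a procollagen contains exactly three pro-alpha chains")
    return frozenset(Counter(chains).items()) in ALLOWED_TRIMERS


def productive_trimers(expressed):
    """List every unordered trimer of the expressed chain types that assembles."""
    pool = sorted(set(expressed))
    return [t for t in combinations_with_replacement(pool, 3) if assembles(t)]


EXPRESSED = ["pro-alpha1(I)", "pro-alpha2(I)", "pro-alpha1(III)"]
TRIMERS = productive_trimers(EXPRESSED)
```

Of the ten possible unordered trimers of three chain types, only two pass the check in this model, which mirrors the statement that other combinations of pro-α chains do not assemble into procollagens.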

In accordance with the invention, we have devised methods by which desired
pro-α chains or derivatives thereof can be expressed and assembled into procollagens
or derivatives thereof in a system which co-expresses and assembles pro-α chains or
derivatives thereof of at least one further procollagen or derivative thereof without
undesired co-assembly producing unwanted hybrid molecules. This is effected by
exogenously manipulating or selecting the gene or genes that encode for the desired
pro-α chains or derivatives thereof such that the domains having C-terminal
propeptide activity of these pro-α chains or derivatives thereof that are expressed from
the manipulated or selected gene or genes will not associate with (and therefore not
co-assemble with) the domains having C-terminal propeptide activity of the pro-α
chains or derivatives thereof of the said at least one further procollagen or derivative
thereof. Put alternatively, the domains having C-terminal propeptide activity of the
pro-α chain or derivative expressed by the manipulated or selected gene are such that
association among pro-α chains expressed from such a gene and association with
any pro-α chain which forms the further procollagen or derivative thereof are
mutually exclusive.

Thus, in accordance with the present invention, a gene for expressing a pro-α
chain or derivative thereof for assembly into a desired procollagen may be
exogenously selected or constructed to express a pro-α chain or derivative thereof
comprised of (i) a first moiety incorporating at least the recognition sequence of the
C-terminal propeptide domain of a first type of pro-α chain, and (ii) a second moiety,
attached to the first moiety, which will assemble into the desired procollagen. The
second moiety preferably is at least partially capable of trimerising to form a triple
helix. More preferably the second moiety comprises at least some amino acids
capable of trimerising with other α chains or derivatives thereof. The expressed
molecule is one which has been "engineered" (by appropriate selection of the first and
second moieties) such that it may be expressed and assembled in a system which
co-expresses and assembles at least one further type of pro-α chain without undesirable
formation of hybrid molecules.

The domain having C-propeptide activity expressed by the exogenously
selected or modified gene may comprise a recognition sequence as listed above. The
domain may be a modification (e.g. by substitution or deletion) of such a recognition
sequence, the domain retaining C-propeptide activity.

To prepare exogenously modified genes for use in the method of the invention,
the DNA encoding for the desired recognition sequence may be substituted for the
DNA encoding recognition sequences found in natural or artificially constructed
pro-α chain genes to form an exogenously modified gene for use in the method of the
invention.

DNA, particularly cDNA, encoding natural pro-α chains is known and
available in the art. For example, WO-A-9307889, WO-A-9416570 and the
references cited in both of them give details. Such DNA may be used as a convenient
starting point for making a DNA molecule that encodes for an exogenously
manipulated gene for use in the invention.

DNA sequences, cDNAs, full genomic sequences and minigenes (genomic
sequences containing some, but not all, of the introns present in the full length gene)
may be inserted by recombinant means into a DNA sequence coding for naturally
occurring pro-α chains (such as the starting point DNA mentioned above) to form the
DNA molecule that encodes for an exogenously manipulated gene for use according
to the first aspect of the invention. Because of the large number of introns present in
collagen genes in general, experimental practicalities will usually favour the use of
cDNAs or, in some circumstances, minigenes. The inserted DNA sequences, cDNAs,
full genomic sequences or minigenes code for amino acids which give rise to pro-α
chains or derivatives thereof with a C-terminal propeptide domain which will not
co-assemble with the C-terminal propeptide domain of the pro-α chains or derivatives
thereof that assemble to form the said at least one further procollagen or derivative
thereof.

Preferred exogenous manipulations of the gene or genes involve alteration of
the recognition sequence within the C-terminal propeptide domain which is
responsible for selective association of pro-α chains such that any pro-α chain or
derivative thereof expressed from the manipulated gene will not undesirably
co-assemble with pro-α chains endogenously expressed from a host cell into which the
exogenously manipulated gene or genes is or are introduced.

In our previous application PCT/GB96/02122 (WO-A-97/08311) we disclosed
novel molecules comprising combinations of natural or novel C-terminal propeptide
domains with alien α chains (or a non-collagen material). PCT/GB96/02122 also
disclosed DNA molecules encoding such molecules. These DNA molecules may be
used according to the methods of the current invention. Such molecules disclosed in
PCT/GB96/02122 are incorporated herein by reference.

Alternatively deletion, addition or substitution mutations may be made within
the DNA encoding for any one of these recognition sequences which alter selectivity
and specificity of pro-α chain association.

Other preferred exogenous manipulations of a gene involve the construction of
gene constructs which encode for chimeric pro-α chains or derivatives thereof formed
from the genetic code of at least two different pro-α chains. It is particularly preferred
that the chimeric pro-α chains or derivatives thereof comprise a recognition sequence
from the C-terminal propeptide domain of one type of pro-α chain and the α chain
domain from another type of pro-α chain. Preferred chimeric pro-α chains or
derivatives thereof comprise the recognition sequence of a pro-α1(I), pro-α2(I),
pro-α1(II), pro-α1(III), pro-α1(V), pro-α2(V), pro-α1(XI) or pro-α2(XI) pro-α
chain and an α chain domain selected from a different one of these pro-α chains.
Most preferred pro-α chains for making chimeric pro-α chains or derivatives thereof
are those which form collagens I and III, particularly pro-α2(I) and pro-α1(III).
Specific preferred chimeric pro-α chains or derivatives thereof are disclosed in the
Example.

In a preferred exogenous manipulation of a gene according to the methods of
the invention, the DNA encoding for the recognition sequence of the proα2(I) chain
gene can be replaced with the corresponding DNA encoding for the recognition
sequence of the proα1(III) chain gene and this manipulated gene can be expressed and
assembled to form procollagens which are proα2(I) homotrimers (instead of
proα1(III) homotrimers which would normally be formed from pro-α chains
containing these recognition sequences). Thus according to the invention proα2(I)
homotrimers derived from an exogenous source may be formed which do not
co-assemble with proα2(I) chains endogenous to the cell in which expression occurs
which have "natural" recognition sequences.
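
The replacement just described can be sketched as a string substitution. This is an illustration only, under stated assumptions: the domain strings are hypothetical placeholders and not real collagen sequences, the recognition sequence is treated as one contiguous block, and the helper `swap_recognition_sequence` is not a function from any library.

```python
# Hypothetical placeholder domains: these are NOT real collagen sequences.
HELIX_A2_I = "<alpha2(I) N-propeptide and triple-helical domain>"
CPRO_A2_I_LEFT = "<alpha2(I) C-propeptide, upstream part>"
CPRO_A2_I_RIGHT = "<alpha2(I) C-propeptide, downstream part>"
REC_A2_I = "<alpha2(I) recognition sequence>"
REC_A1_III = "<alpha1(III) recognition sequence>"


def swap_recognition_sequence(chain, resident, donor):
    """Return chain with its single resident recognition sequence replaced by donor."""
    occurrences = chain.count(resident)
    if occurrences != 1:
        raise ValueError(
            f"expected exactly one resident recognition sequence, found {occurrences}"
        )
    return chain.replace(resident, donor)


# Native chain: alpha2(I) domains carrying the alpha2(I) recognition sequence.
native = HELIX_A2_I + CPRO_A2_I_LEFT + REC_A2_I + CPRO_A2_I_RIGHT

# Chimera: same alpha2(I) domains, but with the alpha1(III) recognition sequence.
chimera = swap_recognition_sequence(native, REC_A2_I, REC_A1_III)
```

Only the recognition block changes; the flanking proα2(I) domains are carried over unchanged, which is the property the manipulation relies on.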

In another preferred exogenous manipulation of a gene according to the
methods of the invention, the manipulated gene encodes for a molecule comprising at
least a first moiety having the activity of a procollagen C-propeptide (i.e. the
C-terminal propeptide domain of a pro-α chain) and a second moiety selected from any
one of an alien collagen α chain and non-collagen materials, the first moiety being
attached to the second moiety. Genes which encode for a second moiety of a
non-collagen material (such as those disclosed in PCT/GB96/02122) are examples of
pro-α chain derivatives for use according to the invention.

Alternatively the gene or genes may be selected from naturally occurring
genes such that the recognition sequence within the C-terminal propeptide domain
which is responsible for selective association of pro-α chains is such that any pro-α
chain expressed from the selected gene will not undesirably co-assemble with pro-α
chains endogenously expressed from the host cell into which the gene or genes is or
are introduced.

The exogenously selected or modified gene may be incorporated within a
suitable vector to form a recombinant vector. The vector may for example be a
plasmid, cosmid or phage. Such vectors will frequently include one or more
selectable markers to enable selection of cells transfected with the said vector and,
preferably, to enable selection of cells harbouring the recombinant vectors that
incorporate the exogenously modified gene.

For expression of pro-α chains or derivatives thereof the vectors should be
expression vectors and have regulatory sequences to drive expression of the
exogenously modified gene. Vectors not including such regulatory sequences may
also be used during the preparation of the exogenously modified gene and are useful
as cloning vectors for the purposes of replicating the exogenously modified gene.
When such vectors are used the exogenously modified gene will ultimately be
required to be transferred to a suitable expression vector which may be used for
production of the pro-α chains or derivatives thereof.

The system in which the exogenously selected pro-α chain(s) or exogenously
manipulated gene or genes of the method of the invention may be expressed and
assembled into procollagen or derivatives thereof may be a cell-free in vitro system.
However it is preferred that the system is a host cell which has been transfected with a
DNA molecule according to the second aspect of the invention. Such host cells may
be prokaryotic or eukaryotic. Eukaryotic hosts may include yeasts, insect and
mammalian cells. Hosts used for expression of the protein encoded by the DNA
molecule are ideally stably transformed, although the use of unstably transformed
(transient) hosts is not precluded.

Alternatively a host cell system may involve the DNA molecule being
incorporated into a transgene construct which is expressed in a transgenic plant or,
preferably, animal. Transgenic animals which may be suitably formed for expression
of such transgene constructs include birds such as domestic fowl, amphibian species
and fish species. Procollagens or derivatives thereof and/or collagen polymers
formed therefrom may be harvested from body fluids or other body products (such as
eggs, where appropriate). Preferred transgenic animals are (non-human) mammals,
particularly placental mammals. An expression product of the DNA molecule of the
invention may be expressed in the mammary gland of such mammals and the
expression product may subsequently be recovered from the milk. Ungulates,
particularly economically important ungulates such as cattle, sheep, goats, water
buffalo, camels and pigs are most suitable placental mammals for use as transgenic
animals according to the invention. Equally the transgenic animal could be a human, in
which case the expression of the pro-α chains or derivative thereof in such a person
could be a suitable means of effecting gene therapy.

Host cells, and particularly transgenic plants or animals, may contain other
exogenous DNA, the expression of which facilitates the expression, assembly,
secretion or other aspects of the biosynthesis of procollagen and derivatives thereof
and even collagen polymers formed therefrom. For example, host cells and transgenic
plants or animals may also be manipulated to co-express prolyl 4-hydroxylase, which
is a post-translational enzyme important in the natural biosynthesis of procollagens, as
disclosed in WO-A-9307889.

The methods of the invention enable the expression and assembly of any
desired procollagen or derivative thereof in a system in which conventionally there
would be undesirable co-assembly or hybridisation of pro-α chains. The methods are
particularly suitable for allowing the expression of procollagen or derivatives thereof
from a wide variety of cell-lines or transgenic organisms without the problems
associated with co-assembly with endogenously expressed pro-α chains. A preferred
use of the methods of the invention is the production of recombinant procollagens in
cell-lines. Examples of cell-lines which may be used are fibroblasts or cell lines
derived therefrom. Baby Hamster Kidney cells (BHK cells), Mouse 3T3 cells,
Chinese Hamster Ovary cells (CHO cells) and COS cells may be used.

The methods of the invention are particularly useful as an improved means of
production of any desired procollagen or derivatives thereof, particularly for scaled up
industrial production by biotechnological means.

The method of the invention may also be useful for treatment by gene therapy
of patients suffering from diseases such as osteogenesis imperfecta (OI), some forms
of Ehlers-Danlos syndrome (EDS) or certain forms of chondrodysplasia. In most
cases the devastating effects of these diseases are due to substitutions of glycine
within the triple helical domain of the pro-α chains by amino acids with bulkier side
chains. This substitution results in triple helix folding, during the formation of
procollagen, being prevented or delayed with the consequence that there is a drastic
reduction in the secretion of the procollagen. The malfolded proteins are retained
within the cell, probably within the endoplasmic reticulum, where they are degraded.
Furthermore, the folding of the C-terminal propeptide domain is not affected by these
mutations within the triple helical domain, therefore C-terminal propeptide domains
from normal as well as mutant chains may associate, resulting in the retention of
normal and mutant pro-α chains within the cell. The retention and degradation of
normal chains due to their interaction with mutant chains amplifies the effect of the
mutation and has been termed "procollagen suicide". The massive loss of protein due
to this phenomenon probably explains why such mutations produce lethal effects.
Identification by the inventors of the recognition sequence which directs the initial
association between pro-α chains provides a target for therapeutic intervention
allowing for the modulation or inhibition of collagen deposition. Thus, the method of
the invention could be utilised as a gene therapy to transfer a copy of the wild-type
gene to an individual with a mutation in the triple helical domain such that the
wild-type gene is exogenously manipulated to code for a pro-α chain with a C-terminal
propeptide domain that will not co-assemble with the mutant pro-α chains. The
patient is then able to secrete authentic collagen chains in cells expressing mutant
chains.
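
The amplification described as "procollagen suicide" can be illustrated with simple arithmetic. The sketch below rests on assumptions that are not stated in the text: mutant and normal chains are taken to be expressed in equal amounts, to associate at random, and any trimer containing a mutant chain is taken to be lost. The function name `all_normal_fraction` is hypothetical.

```python
from fractions import Fraction


def all_normal_fraction(mutant_share, copies_in_trimer):
    """Fraction of trimers containing no mutant chain under random association."""
    normal_share = 1 - Fraction(mutant_share)
    return normal_share ** copies_in_trimer


# Assumed heterozygous case: half of the affected chains are mutant.
# Trimer with three copies of the affected chain (a homotrimer):
homotrimer = all_normal_fraction(Fraction(1, 2), 3)
# Trimer with two copies of the affected chain (a heterotrimer):
heterotrimer = all_normal_fraction(Fraction(1, 2), 2)
```

Under these assumptions only one eighth of homotrimers (or one quarter of heterotrimers with two affected chains) are free of mutant chains, even though half of the chains are normal. A manipulated wild-type chain that cannot associate with the mutant chains would fall outside this calculation, since its trimers would draw only on normal chains.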

The present invention will now be described, by way of example with 
reference to the accompanying drawings, in which: 

Figure 1 is a schematic representation of the stages in normal procollagen
assembly (A) and stages in procollagen assembly according to one embodiment of the
invention (B);

Figure 2 shows an alignment plot of the C-terminal propeptide domains of
pro-α chains from type I and III collagen. The alignment shows amino acids which are
identical (#) or those which are conserved (-■). The conserved cysteine residues are
numbered 1-8, while letters A, B, C, F, G denote the first amino acid at the junctions
between proα1(III) chains and proα2(I) chains of the Example;

Figure 3 is a schematic representation of the chimeric pro-α chains described
in the Example;

Figure 4 is a photograph of an SDS-PAGE gel, illustrating disulphide bond
formation among chimeric gene constructs in which the C-terminal propeptide domains
were exchanged, with the following parental and chimeric molecules from the
Example run in the indicated lanes of the gel: proα1(III)Δ1 [α1(III)], proα2(I)Δ1
[α2(I)] (parental molecules) and proα2(I):(III)CP [α2CP], proα1(III):(I)CP [α1CP]
(hybrid chains); these molecules were expressed in a rabbit reticulocyte lysate in the
presence of semi-permeabilized (SP) HT1080 cells, after which the SP-cells were
isolated by centrifugation, solubilized and the translation products separated by
SDS-PAGE through a 7.5% gel under reducing (lanes 1-4) or non-reducing conditions
(lanes 5-8);

Figure 5 is a photograph of an SDS-PAGE gel; the lanes represent the effect of
heat denaturation of the proα2(I):(III)CP triple-helix at the specified temperatures; the
samples were prepared in the following manner: proα2(I):(III)CP RNA was translated
in the presence of SP-cells, after which the SP-cells were isolated by centrifugation,
solubilized and treated with pepsin (100 µg/ml); the reaction mixture was neutralized,
diluted in chymotrypsin/trypsin digest buffer and divided into aliquots, each aliquot
being heated to a set temperature prior to digestion with a combination of trypsin (100
µg/ml) and chymotrypsin (250 µg/ml); samples were analysed by SDS-PAGE through
a 12.5% gel under reducing conditions (lanes 1-10). Lane 11 (unt) contains
translation products which have not been treated with proteases;

Figure 6 is a photograph of an SDS-PAGE gel illustrating trimerization and
triple-helix formation among chimeric procollagen chains; samples were prepared
from parental chains proα1(III)Δ1, proα2(I)Δ1 which were made into hybrids
proα2(I):(III)CP, A, F, F^(S-C), proα1(III):(I)CP [α2CP, A, F, F^(S-C), B^(S-C), C^(S-C), α1CP]; the
hybrids were translated in a rabbit reticulocyte lysate in the presence of SP-cells, after
which the SP-cells were isolated by centrifugation, solubilized and a portion of the
translated material separated by SDS-PAGE under non-reducing conditions through a
7.5% gel (lanes 1-9);

Figure 7 is a photograph of an SDS-PAGE gel illustrating trimerization and
triple-helix formation among chimeric procollagen chains; the lanes show the remainder
of the samples that were loaded on the gel of Figure 6, which were treated with pepsin
(100 µg/ml) prior to neutralization and digestion with a combination of trypsin (100
µg/ml) and chymotrypsin (250 µg/ml); the proteolytic digestion products were
analysed by SDS-PAGE through a 12.5% gel under reducing conditions (lanes 1-9);

Figure 8 is a photograph of an SDS-PAGE gel, illustrating trimerization and
triple-helix formation among chains containing the 23 amino acid B-G motif; the
lanes show recombinant procollagen chains proα1(III):(I)CP, proα2(I):(III)CP and
proα2(I):(III)BGR^(S-C) which were expressed in a reticulocyte lysate supplemented with
SP-cells, after which the SP-cells were isolated by centrifugation, solubilized and a
portion of the translated material separated by SDS-PAGE through a 7.5% gel, under
reducing (lanes 1-3) or non-reducing conditions (lanes 4-5);

Figure 9 is a photograph of an SDS-PAGE gel, illustrating trimerization and
triple-helix formation among chains containing the 23 amino acid B-G motif; the
lanes show the remainder of the samples that were loaded on the gel of Figure 8, which
were treated with pepsin (100 µg/ml) prior to neutralization and digestion with a
combination of trypsin (100 µg/ml) and chymotrypsin (200 µg/ml); the proteolytic
digestion products were analysed by SDS-PAGE through a 12.5% gel under reducing
conditions (lanes 1-3);

Figure 10 is a photograph of an SDS-PAGE gel, illustrating the effect of Cys-Ser
reversion and Leu-Met mutation on the assembly of proα2(I):(III)BGR chains; the
lanes show recombinant procollagen chains proα2(I):(III)BGR^(S-C), proα2(I):(III)BGR^(C-S)
and proα2(I):(III)BGR^(L-M) which were translated in a reticulocyte lysate supplemented
with SP-cells, after which the cells were isolated by centrifugation, solubilized and a
portion of the translated material separated by SDS-PAGE through a 7.5% gel, under
reducing (lanes 1-3) or non-reducing conditions (lanes 4-6);

Figure 11 is a photograph of an SDS-PAGE gel, illustrating the effect of Cys-Ser
reversion and Leu-Met mutation on the assembly of proα2(I):(III)BGR chains; the
lanes show the remainder of the samples that were loaded on the gel of Figure 10, which
were treated with pepsin (100 µg/ml) prior to neutralization and digestion with a
combination of trypsin (100 µg/ml) and chymotrypsin (250 µg/ml); the proteolytic
digestion products were analysed by SDS-PAGE through a 12.5% gel under reducing
conditions (lanes 1-3);

Figure 12 is a photograph of an SDS-PAGE gel, illustrating inter-chain
disulfide bonds formed between proα2(I):(III)BGR C-terminal propeptide domains; the
lanes show recombinant pro-α chains proα1(III)Δ1 and proα2(I):(III)BGR which
were translated in a reticulocyte lysate supplemented with SP-cells. The cells were
isolated by centrifugation, solubilized and digested with 1.5 units of bacterial
collagenase. The products of digestion were analysed by SDS-PAGE through a 10%
gel under reducing (lanes 2 and 3) or non-reducing (lanes 4 and 5) conditions; and



WO 98/38303




PCT/GB98/00468 



Figure 13 is a schematic representation of sequence alignment of the chain
selectivity recognition domains in other fibrillar procollagens; sequence homology
within the 23 residue B-G motif is illustrated, the boxed regions indicating the position
of the unique 15 residue sub-domain which directs pro-α chain discrimination.

Figure 1 illustrates how procollagen is assembled in the endoplasmic reticulum
of a cell. Normally assembly is initiated by type-specific association of C-terminal
propeptide domains of complementary pro-α chains (1) to form procollagens (2).
Procollagen is secreted from the cell in which it is synthesised and is then acted upon by
Procollagen N-Proteinases and Procollagen C-Proteinases which cleave the N-terminal
propeptide and C-terminal propeptide respectively to yield collagen molecules (3).
Collagen molecules may then spontaneously aggregate to form collagen fibrils. Pro-α
chains with non-complementary C-terminal propeptide domains (4) do not associate
and form procollagens. When exogenous pro-α chains (5) are introduced into a cell
they may co-assemble with endogenous pro-α chains (6) which have complementary
C-terminal propeptide domains to form undesirable hybrids (7). According to the
methods of the invention exogenously manipulated pro-α chains (8) are generated with
C-terminal propeptide domains that are no longer complementary to the C-terminal
propeptide domains of the endogenous pro-α chains (6) such that the exogenously
manipulated pro-α chains (8) may form procollagens (9) and subsequently collagen
molecules (10) without co-assembly with endogenous pro-α chains (6) occurring.

EXAMPLE

The inventors generated DNA molecules which may be used according to the
methods of the invention. These DNA molecules were used to express pro-α chains
with altered selectivity for pro-α chain assembly. Experimental strategy was based on
the assumption that transfer of C-terminal propeptide domains (or sequences within
the C-propeptide) from the homotrimeric proα1(III) chain to the proα2(I) molecule
would be sufficient to direct self-association and assembly into homotrimers of
proα2(I). The inventors reconstituted the initial stages in the assembly of procollagen
by expressing specific RNAs in a cell-free translation system in the presence of
semi-permeabilized cells known to carry out the co- and post-translational modifications
required to ensure assembly of a correctly aligned triple helix. By analysing the
folding and assembly pattern of procollagens formed from a series of chimeric pro-α
chains in which specific regions of the C-terminal propeptide domain of proα1(III)
were exchanged with the corresponding region within the proα2(I) chain (and vice
versa) the inventors identified a short discontinuous sequence of 15 amino acids
within the proα1(III) C-propeptide which directs procollagen self-association. This
sequence is, therefore, responsible for the initial recognition event and is necessary to
ensure selective chain association.

1. MATERIALS AND METHODS

1.1 Construction of recombinant plasmids

pα1(III)Δ1 and pα2(I)Δ1 are recombinant pro-α chains with truncated α chain
domains which have been described previously (see Lees and Bulleid (1994) J. Biol.
Chem. 269 p24354-24360). Chimaeric molecules were generated by PCR
overlap extension using the principles outlined by Horton (1993) Methods in
Molecular Biology Vol 15, Chapter 25, Humana Press Inc., Totowa, NJ. PCRs
(100 µl) comprised template DNA (500 ng), oligonucleotide primers (100 pmol
each) in 10 mM KCl, 20 mM Tris-HCl pH 8.8, 10 mM (NH4)2SO4, 2 mM MgSO4,
0.1% (v/v) Triton X-100, 300 µM each dNTP. Ten rounds of amplification were
performed in the presence of 1 unit Vent DNA polymerase (New England Biolabs,
MA). Recombinants pα2(I)Δ1:(III)CP, A, F, B^S-C, C were generated using a 5'
oligonucleotide primer (5' AG Al GGTCXiCAC TGCiACATU 3') complementary to a
sequence 70 bp upstream of an SfiI site in pα2(I)Δ1 and a 3' oligonucleotide primer
(5' TCXiCAGGGATCCXi'l IX iGTCACTTGCACTGGTT 3') complementary to a region
100 bp downstream of the stop codon in pα1(III)Δ1. A BamHI site was introduced
into this primer to facilitate subsequent sub-cloning steps. Pairs of internal
oligonucleotides, of which one included a 20 nucleotide overlap, were designed to
generate molecules with precise junctions as delineated (see Figs 2 and 3). Overlap
extension yielded a product of -WO bp which was purified, digested with XhoI-BamHI
and ligated into pα2(I)Δ1 from which a HlKu bp XhoI-BamHI fragment had
been excised. Recombinants pα1(III)Δ1:(I)CP, C were synthesized in a similar
manner using a 5' oligonucleotide (5' A ATGG AG( "IX I* I GG A(X X\<\T( i 3')
complementary to a sequence 100 bp upstream of an XhoI site in pα1(III)Δ1 and a 3'
amplification primer (5' X'TGG 1 AGGTAGGAAA'l (i( .AAGGA I I I AGG TTT 3')
which incorporated a KpnI site and was complementary to a region 100 bp
downstream of the stop codon in pα2(I)Δ1. Overlap extension produced a fragment
of 1100 bp which was digested with XhoI and KpnI and ligated into pα1(III)Δ1 from
which an 1X60 bp fragment had been removed. Recombinant pα2(I):(III)BGR was
constructed using the same amplification primer used to synthesize the
proα2(I)Δ1:(III) series of chimeras and a 3' oligonucleotide which was identical to
that used to generate the proα1(III)Δ1:(I)CP, C constructs except that it contained a
BamHI site instead of KpnI (both complementary to pα2(I)Δ1). Primary
amplification products were generated from pα2(I)Δ1:(III)C and pα2(I)Δ1 with
internal oligonucleotides determining the junction. Overlap extension produced a
fragment which was digested with SfiI and BamHI and ligated into pα2(I)Δ1.
Site-directed mutagenesis was performed essentially as described by Kunkel et al.
(Kunkel et al. (1987) Methods in Enzymol. 154 p367-382), except that extension
reactions were performed in the presence of 1 unit T4 DNA polymerase and 1 µg T4
gene 32 protein (Boehringer, Lewes, UK).
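
The overlap extension strategy used for the chimeric junctions can be sketched in code. The Python fragment below is an illustration only, not the inventors' primer design: both template segments are invented toy sequences, the function names and the 18-nucleotide annealing length are assumptions, and only the 20-nucleotide overlap is taken from the text.

```python
# Illustrative sketch of primer design for PCR overlap extension.
# All sequences are toy examples, not the sequences used by the inventors.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def design_junction_primers(upstream: str, downstream: str,
                            overlap: int = 20, anneal: int = 18):
    """Design the pair of internal primers for one chimeric junction.

    The forward internal primer anneals to the start of the downstream
    segment. The reverse internal primer anneals to the end of the upstream
    segment and carries a 5' tail copying the first `overlap` nucleotides
    of the downstream segment, so the two primary products overlap.
    """
    forward_internal = downstream[:overlap]
    reverse_internal = reverse_complement(
        upstream[-anneal:] + downstream[:overlap])
    return forward_internal, reverse_internal


def fuse_by_overlap(product_1: str, product_2: str, overlap: int = 20) -> str:
    """Join two primary products that share `overlap` nucleotides."""
    if product_1[-overlap:] != product_2[:overlap]:
        raise ValueError("primary products do not share the expected overlap")
    return product_1 + product_2[overlap:]


# Hypothetical segments on either side of a junction (toy sequences).
upstream_segment = "ATGGCTCTGAAGGACCTCAAGGTCCTCGTGGAGATAAGGGT"
downstream_segment = "GACCCAGGAAATCCTGAACTTCCAGAAGATGTTCTTGATGTGTAA"

fwd, rev = design_junction_primers(upstream_segment, downstream_segment)

# First-round products: the upstream product gains the 20-nt tail.
primary_1 = upstream_segment + downstream_segment[:20]
primary_2 = downstream_segment

chimera = fuse_by_overlap(primary_1, primary_2)
```

In this sketch the fused product is simply the upstream segment followed by the downstream segment, which is the "precise junction" the method aims for; restriction sites in the outer primers are not modelled.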

1.2 Transcription in vitro

Transcription reactions were carried out as described by Gurevich et al. (1987)
(see Gurevich et al. (1991) Anal. Biochem. 195 p207-213). Recombinant plasmids
pα1(III)Δ1, pα1(III)Δ1:(I)CP, C and pα2(I)Δ1, pα2(I)Δ1:(III)CP, A, F, F^S-C, B^S-C, C^S-C
(10 µg) were linearized and transcribed using T3 RNA polymerase or T7 RNA
polymerase (Promega, Southampton, UK) respectively. Reactions (100 µl) were
incubated at 37°C for 4 h. Following purification over RNeasy columns (Qiagen,
Dorking, UK), RNA was resuspended in 100 µl RNase-free water containing 1 mM
DTT and 40 units RNasin (Promega, Southampton, UK).

1.3 Translation in vitro

RNA was translated using a rabbit reticulocyte lysate (Flexilysate, Promega,
Southampton) for 2 hours at 30°C in the absence of exogenous DTT. The translation
reaction (25 µl) contained 17 µl reticulocyte lysate, 1 µl 1 mM amino acids (minus
methionine), 0.45 µl 100 mM KCl, 0.25 µl ascorbic acid (5 mg/ml), 15 µCi
L-[35S]methionine (Amersham International, Bucks, UK), 1 µl transcribed RNA and 1 µl
(~2 x 10^) semi-permeabilized cells (SP-cells) prepared as described by Wilson et al.
(1995) Biochem. J. 307 p679-687. After translation, N-ethylmaleimide was added to
a final concentration of 20 mM. SP-cells were isolated by centrifugation in a
microfuge at 10000 g for 5 min and the pellet resuspended in an appropriate buffer for
subsequent enzymic digestion or gel electrophoresis.
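
As a worked check on the reaction composition, the contribution of each added stock to the 25 µl reaction follows from C_final = C_stock x V_added / V_total. The short Python sketch below applies this to the volumes quoted above; the helper name is hypothetical, and the figures describe only what the added stocks contribute, not whatever the lysate itself already contains.

```python
# Final concentration contributed by a stock diluted into the translation
# reaction. Volumes and stock concentrations are those quoted in the text;
# the helper name is hypothetical.

REACTION_VOLUME_UL = 25.0


def final_concentration(stock_conc: float, added_volume_ul: float,
                        total_volume_ul: float = REACTION_VOLUME_UL) -> float:
    """C_final = C_stock * V_added / V_total (same units as the stock)."""
    return stock_conc * added_volume_ul / total_volume_ul


kcl_mM = final_concentration(100.0, 0.45)         # 100 mM KCl stock
amino_acids_mM = final_concentration(1.0, 1.0)     # 1 mM amino acid mix
ascorbate_mg_ml = final_concentration(5.0, 0.25)   # 5 mg/ml ascorbic acid

print(f"KCl added: {kcl_mM:.2f} mM")
print(f"Amino acids added: {amino_acids_mM * 1000:.0f} uM")
print(f"Ascorbic acid added: {ascorbate_mg_ml:.2f} mg/ml")
```

On these figures the added stocks contribute about 1.8 mM KCl, 40 µM of each amino acid and 0.05 mg/ml ascorbic acid.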

1.4 Bacterial collagenase digestion

SP-cells were resuspended in 50 mM Tris-HCl pH 7.4 containing 5 mM
CaCl2, 1 mM phenylmethanesulfonyl fluoride (PMSF), 5 mM N-ethylmaleimide and
1% (v/v) Triton X-100 and incubated with 3 units collagenase form III (Advance
Biofacture, Lynbrook, NJ) at 37°C for 1 h. The reaction was terminated
by the addition of SDS-PAGE sample buffer.

1.5 Proteolytic digestion

Isolated SP-cells were resuspended in 0.5% (v/v) acetic acid, 1% (v/v) Triton
X-100 and incubated with pepsin (100 µg/ml) for 2 h at 20°C or 16 h at 4°C. The
reactions were stopped by neutralization with Tris-base (100 mM). Samples were
then digested with a combination of chymotrypsin (250 µg/ml) and trypsin (100
µg/ml) (Sigma, Poole, Dorset, UK) for 2 min at room temperature in the presence of
50 mM Tris-HCl pH 7.4 containing 0.15 M NaCl, 10 mM EDTA. The reactions
were stopped by the addition of soy bean trypsin inhibitor (Sigma, Poole, Dorset, UK)
to a final concentration of 500 µg/ml and boiling SDS-PAGE loading buffer. Samples
were then boiled for 5 min.

1.6 Thermal denaturation

Pepsin-treated samples were resuspended in 50 mM Tris-HCl pH 7.4
containing 0.15 M NaCl, 10 mM EDTA, and aliquots placed in a thermal cycler. A
stepwise temperature gradient was set up from 31°C to 40°C with the temperature
being held for 2 min at 1°C intervals. At the end of each time period the sample was
treated with a combination of chymotrypsin and trypsin, as described above.

1.7 SDS-PAGE

Samples were resuspended in SDS-PAGE loading buffer (0.0625 M Tris-HCl pH
6.8, SDS (2% w/v), glycerol (10% v/v) and Bromophenol Blue) in the presence or
absence of 50 mM DTT and boiled for 5 min. SDS-PAGE was performed using the
method of Laemmli (1970) Nature 227 p680-685. After electrophoresis, gels were
processed for autoradiography and exposed to Kodak X-Omat AR film, or images
quantified by phosphorimage analysis.

2. RESULTS



2.1 Transfer of the proα1(III) C-propeptide to the proα2(I) chain is sufficient to
direct self-assembly.

Experimental strategy was based on the assumption that transfer of the
C-terminal propeptide domain from the proα1(III) chain to the proα2(I) chain should be
sufficient to direct self-recognition and assembly into homotrimers. Hence, by
exchanging different regions within the proα1(III) C-terminal propeptide domain
with the corresponding sequence from the proα2(I) chain the intention was to
distinguish between sequences that direct the folding of tertiary structure and those
involved in the selection (i.e. recognition of pro-α chains) process. To simplify
analysis of the translation products chimeric procollagen molecules were constructed
from two parental procollagen 'mini-chains', proα1(III)Δ1 and proα2(I)Δ1. These
molecules, which have been described previously (Lees and Bulleid, 1994), comprise
both the N- and C-terminal propeptide domains together with truncated triple-helical
domains. The initial assumption was tested by analysing the folding and assembly of
chimeric procollagen chains in which the C-terminal propeptide domain of the
proα2(I)Δ1 chain was substituted with the equivalent domain from the proα1(III)Δ1
chain (proα2(I):(III)CP) and, conversely, where the C-propeptide of the proα1(III)Δ1 chain
was replaced with that from the proα2(I)Δ1 chain (proα1(III):(I)CP) (see Figs 2 and 3).
The C-propeptide (CP) junction points were determined by the sites of cleavage by
the procollagen C-proteinase (PCP), which is known to occur between Ala and Asp
(residues 1119-1120) in the proα2(I) chain (Kessler et al. (1996) Science 271 p360-362). In
the absence of data regarding the precise location of cleavage within the proα1(III)
chain, the inventors chose to position the junction between Ala and Pro (residues
1217-1218). However, Kessler and co-workers (1996) have subsequently shown that
cleavage by PCP occurs between Gly and Asp (residues 1222-1223), with the
consequence that recombinant proα2(I):(III)CP includes an additional four residues
derived from the proα1(III) C-telopeptide, whilst the C-telopeptide in construct
proα1(III):(I)CP is missing those same four amino acids. RNA transcripts were
transcribed in vitro and expressed in a cell-free system comprising a rabbit
reticulocyte lysate optimized for the formation of disulfide bonds supplemented with
semi-permeabilized HT 1080 cells (SP-cells), which has been shown previously to
carry out the initial stages in the folding, post-translational modification and assembly
of procollagen (Bulleid et al. (1996) Biochem. J. 317 p195-202). The C-terminal
propeptide domains of both proα1(III) and proα2(I) chains contain cysteine residues
which participate in the formation of interchain disulfide bonds. Translation products
were, therefore, separated by SDS-PAGE under reducing and non-reducing conditions
in order to detect disulfide-bonded trimers. Translation of the parental molecules
proα1(III)Δ1 and proα2(I)Δ1 yielded major products of ~77 kDa and 61 kDa
respectively (Figure 4, lanes 1 and 2), the size differential being accounted for by the
relative molecular weights of the N-propeptides and truncated triple-helical domains
in each molecule (Lees and Bulleid, 1994). The heterogeneity of the translation
products is due to hydroxylation of proline residues in the triple-helical domain that
leads to an alteration in electrophoretic mobility (Cheah et al. (1979) Biochem.
Biophys. Res. Comm. 91 p1025-1031). The additional lower molecular weight
proteins present in lanes 3 and 7 probably represent translation products obtained after
initiation of translation at internal start codons. We have previously shown that these
minor translation products are not translocated into the endoplasmic reticulum (Lees
and Bulleid, 1994). The presence of high molecular weight species under
non-reducing conditions but not reducing conditions is indicative of interchain disulfide
bond formation. Separation under non-reducing conditions revealed that proα1(III)Δ1,
but not proα2(I)Δ1, chains were able to self-associate to form disulfide-bonded trimers
(Figure 4, lanes 5 and 6). A similar examination of chimeric chains proα2(I):(III)CP
and proα1(III):(I)CP revealed that only proα2(I):(III)CP chains were able to form
disulfide-bonded homotrimers (Figure 4, lanes 3, 4, 7 and 8), demonstrating that the
C-propeptide from type III procollagen is both necessary and sufficient to drive the
initial association between procollagen chains.

It has been shown previously that proα1(III)Δ1 chains synthesised in the
presence of SP-cells were resistant to a combination of pepsin, chymotrypsin and
trypsin in a standard assay used specifically to detect triple-helical procollagen
(Bulleid et al., 1996). The inventors confirmed that proα2(I):(III)CP chains had the
ability to form a correctly aligned triple-helix by performing a thermal denaturation
experiment in which translated material was heated to various temperatures prior to
protease treatment (Figure 5). The results indicate that at temperatures below 35°C a
protease-resistant triple-helical fragment is present, but at temperatures above 35°C
the triple-helix melts and becomes protease sensitive (Figure 5, lanes 1-10). The
melting temperature (Tm) was calculated to be ~35.5°C after quantification by
phosphorimage analysis. The Tm value obtained for proα2(I):(III)CP is significantly
lower than the figure of 39.5°C obtained for proα1(III)Δ1 (Bulleid et al., 1996) and
probably reflects the percentage of hydroxyproline residues relative to the total
number of amino acids in the triple-helical domain (11% and 15%, respectively).
These results indicate that transfer of the proα1(III) C-propeptide enables the inventors
to generate an entirely novel procollagen species comprising three proα2(I) chains
that fold into a correctly aligned triple-helix.
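
One common way to turn a stepwise denaturation series into a melting temperature is to take the temperature at which half of the protease-resistant signal has been lost. The text does not state how the inventors calculated Tm, so the Python sketch below is an assumption about method, and the band intensities are invented for illustration; they are not data from the experiment.

```python
# Estimate a melting temperature (Tm) as the temperature at which half of the
# protease-resistant signal has been lost. The band intensities below are
# invented for illustration; they are not data from the experiment.

def melting_temperature(temps_c, fraction_resistant):
    """Linearly interpolate the temperature where the fraction crosses 0.5."""
    points = list(zip(temps_c, fraction_resistant))
    for (t0, f0), (t1, f1) in zip(points, points[1:]):
        if f0 >= 0.5 > f1:
            return t0 + (f0 - 0.5) * (t1 - t0) / (f0 - f1)
    raise ValueError("fraction never crosses 0.5 in the supplied range")


temps = list(range(31, 41))  # 31 to 40 degrees C in 1 degree steps
# Hypothetical intensities, normalised to the 31 degree C sample.
fractions = [1.00, 1.00, 0.95, 0.90, 0.70, 0.30, 0.10, 0.00, 0.00, 0.00]

tm = melting_temperature(temps, fractions)
print(f"Estimated Tm: {tm:.1f} C")
```

With 1°C steps the estimate depends on interpolating between the two samples that bracket the midpoint, so it can resolve half-degree values such as 35.5°C even though no sample was held at that temperature.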

2.2 Assembly of recombinant procollagen chains with chimeric C-propeptides. 

Given that the proα2(I):(III)CP hybrid pro-α chain includes all of the
information required for self-association we reasoned that progressive removal of the
proα1(III) C-propeptide sequence and replacement with the corresponding proα2(I)
sequence would eventually disrupt the chain selection mechanism. Conversely, it is
anticipated that transfer of progressively more proα1(III) C-terminal propeptide
domain sequence to the proα1(III):(I)CP chimeric chain would yield a molecule
which was capable of self-assembly. A series of procollagen chains with chimeric
C-terminal propeptide domains was constructed and the ability of individual chains to
form homotrimers with stable triple-helical domains was assessed. A schematic
representation of these recombinants is presented in Figure 2, with the letters A, B, C,
F and G denoting the position of each junction. It should be noted that the proα1(III)
and proα2(I) C-propeptides differ in their complement of cysteine residues, with
proα2(I) lacking the Cys2 residue. Our previous data suggest that interchain disulfide
bonds within the C-propeptide of type III procollagen form exclusively between Cys2
and 3 (Lees and Bulleid, 1994). However, interchain disulfide bonding, between
either the C-terminal propeptide domains or C-telopeptides, is not required for chain
association and triple-helix formation (Bulleid et al., 1996); therefore, it is possible
that homotrimers may form between chimeric pro-α chains which lack either the
C-terminal propeptide domain Cys2 residue or the C-telopeptide cysteine (only found in
the triple-helical domain of proα1(III)). These molecules will not, however, contain
interchain disulfide bonds and, as a consequence, will not appear as oligomers after
analysis under non-reducing conditions. To circumvent this problem, where
appropriate, the inventors generated their hybrid chains from a recombinant
proα2(I)Δ1 (Lees and Bulleid, 1994) in which the existing serine residue was
replaced by cysteine, thus restoring the potential to form trimers stabilized by
interchain disulfide bonds. It should also be noted that whilst proα1(III):(I)CP lacks
Cys2, it does still retain the potential to form disulfide-bonded trimers by virtue of the
two cysteine residues located at the junction of the triple-helical domain and the
telopeptide. Parental chains proα1(III)Δ1 and proα2(I)Δ1 and hybrids proα2(I):(III)CP,
A, F, F^S-C, B^S-C, C^S-C and proα1(III):(I)C were translated in the presence of
SP-cells and the products separated by SDS-PAGE under non-reducing conditions
(Figure 6). The results demonstrate that recombinants proα1(III)Δ1, proα2(I):(III)CP,
A, F^S-C, B^S-C (Figure 6, lanes 1, 3, 4, 6 and 7) are able to form interchain
disulfide-bonded trimers and dimers, while proα2(I)Δ1, proα2(I):(III)F, C^S-C and
proα1(III):(I)C (Figure 6, lanes 2, 5, 8 and 9) remain monomeric. We have already
demonstrated that interchain disulfide bonding is not a prerequisite for triple-helix
formation (Bulleid et al., 1996); therefore, the inability to form disulfide-bonded
trimers does not preclude the possibility that the molecules assemble to form a
triple-helix. To ascertain whether the chimeric chains had the ability to fold into a
correctly aligned triple-helix, we treated translation products with a combination of
pepsin, chymotrypsin and trypsin and analysed the digested material under reducing
conditions by SDS-PAGE. As shown in Figure 7, recombinants proα1(III)Δ1,
proα2(I):(III)CP, A, F^S-C, F, B^S-C (Figure 7, lanes 1, 3, 4, 5, 6 and 7) all yielded
protease-resistant fragments. The size differential reflects the
relative lengths of the triple-helical domains in each of the parental molecules
[proα2(I)Δ1 ~185 residues and proα1(III)Δ1 ~192 residues]. The ability of
proα2(I):(III)F to form a stable triple-helix confirms that interchain disulfide bonding
is not necessary for triple-helix folding. Thus, hybrid molecules containing sequences
from the proα2(I) C-terminal propeptide domain between the propeptide cleavage site
and the B-junction are able to form homotrimers with stable triple-helical domains
and, therefore, contain all of the information necessary to direct chain self-assembly.
These results indicate that the signal(s) which controls chain selectivity must be
located between the B-junction and the C-terminus of the C-propeptide. Neither
proα2(I):(III)C nor proα1(III):(I)C chains are able to fold into a triple helix. The
inability of these reciprocal constructs to self-associate suggests that chain selectivity
is mediated either by a co-linear sequence that spans the C-junction or by
discontinuous sequence domains located on either side of the C-junction.

2.3 Identification of a sequence motif from the proα1(III) C-propeptide which
directs chain self-assembly

Procollagen chain selectivity is probably mediated through one or more of the
variable domains located within the C-terminal propeptide domain. The sequence
between the B- and C-junctions is one of the least conserved among the procollagen
C-propeptides (Figure 2), yet the inventors have demonstrated that inclusion of this
domain, in the absence of proα1(III) sequence distal to the C-junction, is not
sufficient to direct chain assembly. To ascertain whether the recognition sequence for
chain recognition had indeed been interrupted a further recombinant,
proα2(I):(III)BGR^S-C (B-G replacement) was generated, which contained all of the
proα2(I)Δ1 sequence apart from the Ser→Cys mutation at Cys2 and a stretch of 23
amino acids derived from the type III C-propeptide which spans the C-junction from
points B to G, the B-G motif: GNPELPEDVLDVQLAFLRLLSSR (underscoring
indicates the most divergent residues, see Figure 2). The location of the G-boundary
in the replacement motif allowed for the inclusion of the first non-conserved residues
after the C-junction (SR). When expressed in the presence of SP-cells the chimeric
proα2(I):(III)BGR^S-C chains were able to form inter-chain disulfide-bonded molecules
(Figure 8, lane h), demonstrating that the C-terminal propeptide domains were capable
of self-association. Furthermore, this hybrid was able to fold and form a stable
triple-helix as judged by the formation of a protease-resistant fragment (Figure 9, lane 3).
Proα2(I):(III)BGR^S-C contains a Ser→Cys substitution which enabled the inventors
to assay for the formation of disulfide-bonded trimers. Previous data demonstrated
that this substitution alone does not enable wild-type proα2(I)Δ1 chains to form
homotrimers (Lees and Bulleid, 1994). Nevertheless, to eliminate the possibility that
this mutation influences the assembly pattern a revertant proα2(I):(III)BGR^C-S which
contains the wild-type complement of Cys residues was created. As expected
proα2(I):(III)BGR^C-S was unable to form disulfide-bonded trimers (Figure 10, lane 5)
but did assemble correctly into a protease-resistant triple helix (Figure 11, lane 3).
Thus, the 23-residue B-G motif contains all of the information required to direct
procollagen self-assembly.

The ability of the proα2(I):(III)BGR^S-C chains to form interchain disulfide
bonds suggests that this molecule is able to associate via its C-propeptide. However,
to confirm that this is indeed the case the inventors carried out a collagenase digestion
of the products of the translation (Figure 12). Bacterial collagenase specifically
digests the triple-helical domain, leaving both the N- and C-propeptides intact. The
N-propeptides of both chains do not contain any methionine residues and, as a
consequence, the only radiolabelled product remaining after digestion is the
C-propeptide. Comparison of the samples separated under reducing and non-reducing
conditions demonstrated that inter-chain disulfide-bonded trimers were formed within
the C-terminal propeptide domains of proα1(III)Δ1 and proα2(I):(III)BGR^S-C chains
(Figure 12, lanes 2 and 4, and 3 and 5). This demonstrates that these chains do indeed
associate via their C-terminal propeptide domains.




2.4 The effect of Leu→Met substitution on proα2(I):(III)BGR assembly

Analysis of the 23 amino acid B-G motif from the proα1(III) and proα2(I)
chains (Figure 13) indicates that residues 13-20 (QLAFLRLL) are identical with the
exception of position 17, Leu (L) in proα1(III) and Met (M) in proα2(I). Using
site-directed mutagenesis the inventors substituted the existing Leu residue with Met to
create proα2(I):(III)BGR^L-M and monitored the effect of this mutation on chain
assembly. The Leu→Met mutagenesis was performed using recombinant
proα2(I):(III)BGR^S-C. Both proα2(I):(III)BGR^S-C and proα2(I):(III)BGR^L-M were able
to form interchain disulfide-bonded molecules when analysed under non-reducing
conditions (Figure 10, lanes 4 and 6). Both constructs formed protease-resistant
triple-helical domains (Figure 11, lanes 1 and 3). The Leu→Met substitution did not,
therefore, disrupt the process of chain selection nor did it prevent the formation of a
correctly aligned triple-helix. These observations lead to the conclusion that a
discontinuous sequence of 15 amino acids (GNPELPEDVLDV....SSR) contains all of
the information necessary to allow procollagen chains to discriminate between each
other and assemble in a type-specific manner.
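
The relationship between the 23-residue B-G motif, the Leu→Met substitution at position 17 and the discontinuous 15-residue motif can be made explicit with a few lines of Python. This is a sketch for illustration, assuming the proα1(III) B-G motif reads GNPELPEDVLDVQLAFLRLLSSR; the function names are hypothetical.

```python
# Derive the discontinuous 15-residue motif and the Leu->Met variant from the
# 23-residue B-G motif. Positions are 1-based, as in the text. The sequence
# is assumed to read as below; function names are hypothetical.

BG_MOTIF_III = "GNPELPEDVLDVQLAFLRLLSSR"  # proalpha1(III) B-G motif


def substitute(seq: str, position: int, expected: str, replacement: str) -> str:
    """Replace the residue at a 1-based position, checking the original."""
    found = seq[position - 1]
    if found != expected:
        raise ValueError(f"position {position} is {found}, not {expected}")
    return seq[:position - 1] + replacement + seq[position:]


def recognition_motif(seq: str, conserved_start: int = 13,
                      conserved_end: int = 20) -> str:
    """Drop the conserved block (residues 13-20) to leave 15 residues."""
    return seq[:conserved_start - 1] + seq[conserved_end:]


bg_leu_to_met = substitute(BG_MOTIF_III, 17, "L", "M")
motif_wild_type = recognition_motif(BG_MOTIF_III)
motif_mutant = recognition_motif(bg_leu_to_met)

print(motif_wild_type, motif_mutant, motif_wild_type == motif_mutant)
```

Because position 17 lies inside the conserved block, the 15-residue motif is the same before and after the substitution, which matches the reasoning that the divergent residues, and not residue 17, carry the selectivity information.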

3. DISCUSSION

The molecular mechanism which enables closely related procollagen chains to
discriminate between each other is a central feature of the assembly pathway. The
initial interaction between the C-terminal propeptide domains both ensures that the
constituent chains are correctly aligned prior to nucleation of the triple-helix and
propagation in a C- to N- direction, and that component chains associate in a collagen
type-specific manner. As a consequence, recognition signals which determine chain
selectivity are assumed to reside within the primary sequence of this domain,
presumably within a region(s) of genetic diversity. By generating chimeric
procollagen molecules from parental 'mini-chains' proα1(III)Δ1 and proα2(I)Δ1 the
inventors have demonstrated that transfer of the proα1(III) C-terminal propeptide
domain to the naturally heterotrimeric proα2(I) molecule was sufficient to direct
formation of homotrimers. Furthermore, analysis of a series of molecules in which
specific sequences were interchanged between proα1(III) and proα2(I) C-terminal
propeptide domains allowed the inventors to identify a discontinuous sequence of 15
amino acids (GNPELPEDVLDV....SSR) within the proα1(III) C-propeptide which,
if transferred to the corresponding region within the proα2(I) chain, is sufficient to
direct self-association. Transfer of the proα1(III) recognition motif to
the proα2(I) chain did not appear to have an adverse effect on chain alignment,
allowing the triple-helical domains to fold into a protease-resistant conformation. This
sequence motif is, therefore, both necessary and sufficient to ensure that procollagen
chains discriminate between each other and assemble in a type-specific manner.

In order to establish a structure-function relationship for the chain recognition
domain, the inventors examined the hydropathy profile and secondary structure
potential of the 23-residue B-G sequence: GNPELPEDVLDVQLAFLRLLSSR. The
data indicate that the 15-residue chain recognition motif: GNPELPEDVLDV....SSR is
markedly hydrophilic, in contrast to the hydrophobic properties of the conserved
region: QLAFLRLL. These features are entirely consistent with a potential role for
this motif in mediating the initial association between the component procollagen
monomers. An examination of the 15-residue recognition motif from other fibrillar
procollagens predicts that they are all relatively hydrophilic and probably assume a
similar structural conformation, regardless of the degree of diversity in the primary
sequence (Figure 13). It is, presumably, the nature of the amino acid changes which
provides the distinguishing topographical features necessary to ensure differential
chain association. An examination of the B-G sequence alignment (Figure 13)
indicates that residues 1, 2, 12 and 21 are more tightly conserved than amino acids
3-11, 22 and 23, suggesting that the latter may form a core recognition sequence that is
of critical importance in the selection process. We do not know whether the other
four residues participate directly in chain discrimination but this can be tested
experimentally by site-directed mutagenesis.
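
The contrast in hydropathy between the recognition motif and the conserved block can be illustrated numerically. The sketch below uses the Kyte-Doolittle scale, which is an assumption: the text does not say which scale or software the inventors used, and the sequences are assumed to read as GNPELPEDVLDV....SSR and QLAFLRLL. Function names are hypothetical.

```python
# Mean hydropathy of the recognition motif versus the conserved block, using
# the Kyte-Doolittle scale (an assumption: the text does not name the scale).

KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(seq: str) -> float:
    """Average Kyte-Doolittle value; positive means more hydrophobic."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def hydropathy_profile(seq: str, window: int = 5):
    """Sliding-window mean hydropathy along a sequence."""
    return [mean_hydropathy(seq[i:i + window])
            for i in range(len(seq) - window + 1)]


recognition = "GNPELPEDVLDV" + "SSR"   # discontinuous 15-residue motif
conserved = "QLAFLRLL"                 # residues 13-20 of the B-G motif
bg_motif = "GNPELPEDVLDVQLAFLRLLSSR"

print(f"recognition motif: {mean_hydropathy(recognition):+.2f}")
print(f"conserved block:   {mean_hydropathy(conserved):+.2f}")
profile = hydropathy_profile(bg_motif)
```

On this scale the recognition motif averages about -0.75 and the conserved block about +1.48, in line with the hydrophilic versus hydrophobic contrast described above. A different scale or window would change the numbers but is unlikely to reverse the sign of the difference.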

The inventors have identified the functional domain which determines chain
selectivity and show that trimerization is initiated via an interaction(s) between these
identified recognition sequences. It is unclear, however, whether the interactions
which determine chain composition are the same as those which allow productive
association and stabilization of the trimer. The nature of potential stabilizing
interactions is uncertain, but recent data (Bulleid et al., 1996) indicate that, for type III
procollagen at least, the formation of interchain disulfide bonds does not play a direct
role in procollagen assembly. It has also been postulated that a cluster of four
aromatic residues, which are conserved in the fibrillar collagens, collagens X, VIII
and collagen-like complement factor C1q, may be of strategic importance in
trimerization.

The C-telopeptides were originally proposed to have a role in both procollagen
assembly and in chain discrimination, the latter by virtue of the level of sequence
diversity between various procollagen chains. However, the inventors have recently
demonstrated (Bulleid et al., 1996) that the C-telopeptides of type III collagen do not
interact prior to nucleation of the triple-helix, ruling out a role for this peptide
sequence in the initial association of the C-propeptides. Data obtained from the
assembly of hybrid chains indicate that the ability to discriminate between chains
does not segregate with the species of C-telopeptide, lending support to this assertion.

Using this approach the inventors have been able to synthesize an entirely
novel procollagen species comprising three proα2(I)Δ1 chains, [proα2(I)Δ1]3.
Throughout this study procollagen 'mini-chains' with truncated triple-helical domains
were used; however, the inventors have also demonstrated that full-length proα2(I)
chains containing the 15-residue proα1(III) recognition sequence also self-associate
into a triple-helical conformation (data not shown). Thus, the ability to introduce the
chain recognition sequence into different pro-α chains provides the means to design
novel collagen molecules with defined chain compositions. This, in turn, introduces
the possibility of producing collagen matrices with defined biological properties, such
as enhanced or differential cell-binding or adhesion properties. Furthermore, the
identification of a short peptide sequence which directs the initial association between
procollagen chains may provide a target for therapeutic intervention allowing for the
modulation or inhibition of collagen deposition.

The chimeric constructs described above may be used in the method of the
present invention to allow the expression of exogenous procollagens in any cell-line
without the problems associated with co-assembly with endogenously expressed
procollagen. The uses of the methods of the invention are to express procollagen in
cells either grown in culture or within tissues of the body. This will be of particular
relevance for the production of recombinant procollagen in cell-lines such as
fibroblasts which normally efficiently synthesise fibrillar collagens and in the treatment
of collagen diseases by gene therapy.



